Text-to-image synthesis method based on multi-level progressive resolution generative adversarial networks
XU Yining, HE Xiaohai, ZHANG Jin, QING Linbo
Journal of Computer Applications    2020, 40 (12): 3612-3617.   DOI: 10.11772/j.issn.1001-9081.2020040575
To address the wrong target structures and unclear image textures in the results of text-to-image synthesis tasks, a Multi-level Progressive Resolution Generative Adversarial Network (MPRGAN) model was proposed based on the Attentional Generative Adversarial Network (AttnGAN). Firstly, a semantic separation-fusion generation module was used in the low-resolution layer: the text feature was separated into three feature vectors under the guidance of a self-attention mechanism, and each feature vector was used to generate its own feature map. Then, the feature maps were fused into a low-resolution map, and mask images were used as semantic constraints to improve the stability of the low-resolution generator. Finally, a progressive resolution residual structure was adopted in the high-resolution layers. At the same time, the word attention mechanism and pixel shuffle were combined to further improve the quality of the generated images. Experimental results showed that the Inception Score (IS) of the proposed model reached 4.70 and 3.53 on the Caltech-UCSD Birds-200-2011 (CUB-200-2011) and 102 Category Flower (Oxford-102) datasets respectively, which are 7.80% and 3.82% higher than those of AttnGAN. The MPRGAN model can alleviate the instability of structure generation to a certain extent, and the images it generates are closer to real images.
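The abstract names pixel shuffle as one of the upsampling ingredients but does not describe it. As a rough illustration only, the sketch below shows the generic pixel shuffle rearrangement as it is commonly defined for sub-pixel upsampling: a tensor of shape (C·r², H, W) is reorganized into (C, H·r, W·r). The function name `pixel_shuffle`, the nested-list representation, and the upscale factor `r` are assumptions made for this example; this is not the authors' implementation.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Generic sub-pixel rearrangement; names and layout are illustrative
    assumptions, not taken from the MPRGAN paper.
    """
    channels_in = len(x)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = channels_in // (r * r)
    h, w = len(x[0]), len(x[0][0])
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):          # vertical offset inside each r x r cell
            for j in range(r):      # horizontal offset inside each r x r cell
                src = x[c * r * r + i * r + j]
                for y in range(h):
                    for k in range(w):
                        out[c][y * r + i][k * r + j] = src[y][k]
    return out


# Four 1x1 channels become one 2x2 channel.
print(pixel_shuffle([[[1]], [[2]], [[3]], [[4]]], 2))  # [[[1, 2], [3, 4]]]
```

The operation only moves values from the channel axis into spatial positions, so the spatial resolution grows by a factor of `r` without interpolation.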
Wavelet domain distributed depth map video coding based on non-uniform quantization
CHEN Zhenzhen, QING Linbo, HE Xiaohai, WANG Yun
Journal of Computer Applications    2016, 36 (4): 1080-1084.   DOI: 10.11772/j.issn.1001-9081.2016.04.1080
To improve the decoding quality of depth map video in Distributed Multi-view Video plus Depth (DMVD) coding, a new non-uniform quantization scheme based on the sub-band layer and sub-band coefficients was proposed for wavelet domain Distributed Video Coding (DVC). The main idea was to allocate more bits to pixels belonging to the edges of the depth map, thereby improving the quality of the depth map. According to the distribution characteristics of the wavelet coefficients of the depth map, the low frequency wavelet coefficients of layer N kept the uniform quantization scheme, while the high frequency wavelet coefficients of all layers used the non-uniform quantization scheme. For the high frequency wavelet coefficients around "0", a larger quantization step was adopted. As the amplitude of the high frequency wavelet coefficients increased, the quantization step decreased, giving finer quantization and consequently better edge quality. The experimental results show that, for the "Dancer" and "PoznanHall2" depth sequences with more edges, the proposed scheme achieves a Rate-Distortion (R-D) performance improvement of up to 1.2 dB by improving the quality of edges; for the "Newspaper" and "Balloons" depth sequences with fewer edges, the proposed scheme still achieves an R-D performance improvement of 0.3 dB.
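The rule for high frequency coefficients (coarse steps near zero, finer steps at larger amplitudes) can be sketched as a piecewise quantizer. This is a minimal illustration under stated assumptions: the zone boundaries and step sizes in `ZONES`, the midpoint reconstruction, and the function name `quantize` are hypothetical and are not taken from the paper, which does not give these values in the abstract.

```python
# Hypothetical zones: (lower amplitude bound, quantization step).
# The step shrinks as the amplitude grows, so large (edge-related)
# coefficients are quantized more finely than those around 0.
ZONES = [(0, 8), (16, 4), (48, 2)]


def quantize(coeff, zones=ZONES):
    """Return the reconstruction value for one high-frequency coefficient."""
    mag = abs(coeff)
    lower, step = zones[0]
    for lo, st in zones:
        if mag >= lo:           # keep the last zone whose bound is reached
            lower, step = lo, st
    index = int((mag - lower) // step)          # bin index inside the zone
    recon = lower + index * step + step / 2     # midpoint of that bin
    return recon if coeff >= 0 else -recon


for c in (3, -20, 51):
    print(c, quantize(c))
```

With these assumed zones, the reconstruction error is at most 4 for amplitudes below 16, at most 2 for amplitudes from 16 up to 48, and at most 1 above that, which mirrors the idea of spending more bits on the large-amplitude coefficients associated with depth edges.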